Apparatus for analyzing cardiac events

ABSTRACT

An apparatus for detecting cardiac events in electrograms has a feature extraction unit that derives features of the cardiac events for discriminating among different types of cardiac events. A clustering unit groups cardiac events with similar features into respective clusters defined by predetermined cluster features. The feature extraction unit determines a feature vector describing waveform characteristics of cardiac events in the electrogram by a wavelet transform. The clustering unit determines the distance between the feature vector and corresponding cluster feature vectors in order to assign the cardiac event in question to that cluster which results in a minimum distance, providing that the minimum distance is less than a predetermined threshold value. A heart stimulator provided with such a cardiac event detecting apparatus detects the occurrence of an arrhythmia and appropriately controls a pulse stimulator to treat the arrhythmia.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to an apparatus for cardiac eventsdetected in electrograms, EGMs, and to a heart stimulator provided withsuch an apparatus.

In the following the expression “cardiac event” denotes thedepolarization phase in the cardiac cycle, which for atrial signals iscommonly known as P wave and for ventricular signals as R wave or QRScomplex.

2. Description of the Prior Art

In the field of devices for cardiac rhythm management (CRM), accuraterhythm classification is an increasingly important aspect. Pacemakersare primarily used to assist in bradycardia or when the electricalpropagation path is blocked, whereas the primary use of implantablecardioverter defibrillators (ICD) is to terminate ventriculararrhythmia, a life-threatening condition if not immediately treated. Inboth types of devices, accurate event classification of the electrogramsignal is needed for identifying, e.g., atrial and ventricularfibrillation in order to give appropriate therapy for the detectedarrhythmia. For pacemakers, this may necessitate changing the pacingmode in order to stabilize the ventricular rhythm during an episode ofatrial fibrillation. An ICD responds to ventricular fibrillation bygiving a defibrillating shock intended to terminate the fibrillation.

Ever increasing demands are put on both kinds of devices to betterhandle their primary task as well as to manage other tasks than thoseoriginally intended for. One such task may be, for an implantablemedical device, to identify atrial flutter in order to terminate it byatrial pacing or to defibrillate atrial fibrillation. Although it is nota life-threatening arrhythmia, atrial fibrillation is an inconvenienceto the patient and increases the risk for other diseases such as stroke.Atrial pacing may also be one way of terminating supraventriculartachycardias. An ICD specific task is to identify atrial fibrillation inorder to not mistake it for ventricular fibrillation and the risk ofgiving an unnecessary, and possibly harmful, defibrillation shock.Another, more general, utilization is to efficiently store rhythm datafor later analysis and evaluation, already done in modern ICD's. Bycollecting data, better knowledge of the evolution of cardiac diseasesand the functionality of the device can be obtained.

Clustering represents an important task within the classificationproblem where each individual event is assigned to a cluster of eventswith similar features. Labelling of the clusters, i.e., associating thecluster with a specific cardiac rhythm, completes the classificationsuch that the device can provide proper therapy when needed. However,certain constraints distinguish clustering in CRM devices fromclustering in general. In order to give immediate therapy, it requiresclustering to be done in real-time, thus excluding many iterativeclustering algorithms such as k-means clustering and competitivelearning. Various methods have recently been presented concerningclustering of signals from the surface electrocardiogram (ECG), basedon, e.g., self-organizing maps or fuzzy hybrid neural networks, see M.Lagerholm, C. Peterson, G. Braccini, L. Edenbrabdt, and L. Sörnmo,Clustering ECG complexes using Hermite functions and self-organizingmaps”, IEEE Trans. Biomed. Eng., vol. 47, pp. 838-848, July 2000, and S.Osowski and T. Linh, “ECG beat recognition using fuzzy hybrid neuralnetwork”, IEEE Trans. Biomed. Eng., vol. 48, pp. 1265-1271, November2001. However, most clustering algorithms used for ECG analysis arecomputationally rather complex and therefore unsuitable for implantableCRM devices. Furthermore, not much a priori morphologic information isassociated with the various rhythms in the electrogram (EGM); this is incontrast to the more well-defined ECG.

Previous work in the area of intracardiac event classification mainlyfocused on discrimination of a specific condition in order to discern,e.g., atrial fibrillation from other atrial tachyarrythmias, see A.Schoenwald, A. Sahakian, and S. Swiryn, “Discrimination of atrialfibrillation from regular atrial rhythms by spatial precision of localactivation direction”, IEEE Trans. Biomed. Eng., vol. 44, pp. 958-963,October 1997. Other applications involve discrimination of ventricularfrom supraventricular tachycardia, see L. Koyrakh, J. Gillberg, and N.Wood, “Wavelet based algorithms for EGM morphology discrimination forimplantable ICDs”, in Proc. Of Comp. In Card. (Piscataway, N.J., USA),pp. 343-346, IEEE, IEEE Press, 1999, and G. Grönefeld, B. Schulte, S.Hohnloser, H.-J. Trappe, T. Korte, C. Stellbrink, W. Jung, M. Meesmann,D. Böcker, D. Grosse—Meininghaus, J. Vogt, and J. Neuzner, “Morphologydiscrimination: A beat-to-beat algorithm for the discrimination ofventricular from supraventricular tachycardia by implantablecardioverter defibrillators”, J. Pacing Clin. Electrophysiol., vol. 24,pp. 1519-1524, October 2001. More general classification algorithms,which in turn involve training on individual patients, have been basedon analog neural networks or wavelet analysis for morphologicdiscrimination of arrhythmias.

PCT Application WO 97 39 681 describes a defibrillator control systemcomprising a pattern recognition system. The intracardiac electrogramsignal is digitised and delivered for feature selection into a selector.The feature selector outputs selected features to a trained classifierto provide information as to what group the produced signal should beclustered, e.g. ventricular tachycardia. The classifier outputs theclassified information for use for a therapeutic decision.

U.S. Pat. No. 5,271,411 discloses an ECG signal analysis and cardiacarrhythmia detection by extraction of features from a scalar signal. AQRS pattern vector is then transformed into features describing the QRSmorphology, viz. a QRS feature vector. A normal QRS complex isidentified based on the population of QRS complexes located withinclusters of QRS features within a feature space having a number ofdimensions equal to the number of extracted features. The extractedmorphology information is then used for judging whether a heartbeat isnormal or abnormal.

U.S. Pat. No. 5,638,823 describes non-invasively detecting of coronaryartery diseases. A wavelet transform is performed on an acoustic signalrepresenting one or more sound event caused by turbulence of bloodflowing in an artery to provide parameters for a feature vector. Thisfeature vector is used as one input to neural networks, the outputs ofwhich represent a diagnosis of coronary stenosis in a patient.

In Michael A. Unser et al, “Wavelet Applications in Signal and ImageProcessing IV”, Proceedings SPIE—The International Society for OpticalEngineering, 6-9 Aug. 1996, vol. 2825, part two of two parts, pp.812-821, a wavelet packet based compression scheme for single lead ECGsis disclosed, including QRS clustering and grouping of heart beats ofsimilar structures. For each heart beat detected, its QRS complex iscompared to templates of previously established groups. Point-by-pointdifferences are used as similarity measures. The current beat isassigned to the group whose template is most similar, providedpredetermined conditions are satisfied. Otherwise a new group is createdwith the current QRS complex used as the initial group template.

SUMMARY OF THE INVENTION

An object of the present invention is to provide a technique forseparation of cardiac rhythms in a reliable way on the basis ofelectrogram event clustering

The above object is achieved in accordance with the invention by anapparatus for analyzing cardiac events detected in electrogram (EGMs)having a feature extraction unit that derives features of the cardiacevents for discriminating among different types of detected cardiacevents, and a clustering unit that groups cardiac events with similarfeatures into a cluster, defined by predetermined cluster features,wherein the feature extraction unit determines a feature vector thatdescribes waveform characteristics of the cardiac event EGM signals bymeans of a wavelet transform, and wherein the clustering unit determinesa distance or distances between the feature vector and correspondingcluster feature vectors in order to assign the cardiac event in questionto the cluster that results in a minimum distance, as long as theminimum distance is less than a predetermined threshold value.

Certain arrhythmias are diagnosed immediately to give proper therapy,while, for others, it may be sufficient to record the rhythm for datacollection purposes. Thus, the classification problem, viz. to label therhythms based on clusters using clinical terms, may not always benecessary to implement.

In an embodiment of the apparatus according to the invention bothmorphologic and temporal data are considered for clustering. Morphologicfeatures are efficiently extracted by use of the dyadic wavelettransform after which the events are grouped by a leader-followerclustering embodiment. The event detection problem, based on the sametransform, is previously treated in Swedish patent application no.0103562-5.

According to another advantageous embodiment of the apparatus accordingto the invention an integrator is provided to integrate said distanceover a predetermined period of time. By integrating the distance over aperiod of time it is possible to distinguish irregular rhythms, likeatria! fibrillation, from regular rhythms. The integral total distancein the case of atrial fibrillation will be high whereas regular rhythmswill result in a lower total distance.

The invention also relates to a heart stimulator provided with theabove-mentioned apparatus for controlling the therapeutic stimulationdepending on arrhythmia detection.

DESCRIPTION OF THE DRAWINGS

FIGS. 1 a and 1 b illustrate impulse responses of a filter bank used forcardiac event detection in accordance with the present invention.

FIG. 2 is a flow chart of a clustering algorithm performed in theapparatus according to the present invention.

FIGS. 3 a and 3 b illustrate the computational complexity for differentclustering algorithms.

FIG. 4 a illustrates clustering performance in terms of probability forcorrect clustering of an event, as well as the probability of a dominantevent in the cluster, and FIG. 4 b illustrates a histogram spread of thenumber of clusters.

FIG. 5 a illustrates clustering performance as a function of a distancethreshold, and FIG. 5 b illustrates the number of clusters as a functionof the distance threshold.

FIG. 6 illustrates the clustering performance for one set of parameters.

FIGS. 7 a and 7 b are examples of the clustering result for a connectedEGM for three cases.

FIG. 8 is a flowchart describing the general functioning of anembodiment of an apparatus according to the invention.

FIG. 9 is a block diagram of a heart stimulator providing with a cardiacevent detecting apparatus in accordance with the present invention.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

P Wave Detection and Feature Extraction

A signal model assuming that the event waveform is composed of a linearcombination of representative signals is considered. The featureextraction problem is then to estimate the individual components of therepresentative signals since each morphology will have its own linearcombination. By using the dyadic wavelet transform, different widths ofthe two fundamental monophasic and biphasic waveforms are included inthe model at a low cost.

Feature Extraction

It is assumed that the QRS waveform is composed of a linear combinationof representative signals,H=[h ₁ . . . h _(p)]  (2)where each function, h_(j), j=1, . . . , P, is of size M×1. The onlyrestriction on H is that it must have full rank. Different morphologies,s(n), are modelled by the P×1 coefficient vector θ(n), with the linearmodels(n)=Hθ(n)  (3)where n is a temporal variable describing when the event occurs. Theobserved signal, x(n), is assumed to be modelled by,x(n)=Hθ(n)+w(n)  (4)where the M×1 noise vector w(n) is assumed to be zero-mean, white, andGaussian with variance

$\sigma\;{\frac{2}{\omega}.}$Consequently, x(n) is defined as

$\begin{matrix}{{x(n)} = \begin{bmatrix}{x(n)} \\\vdots \\{x\left( {n + M - 1} \right)}\end{bmatrix}} & (5)\end{matrix}$implying an event duration of M samples, beginning at n. The probabilitydensity function x(n) for a specific realization of θ(n), p(x(n); θ(n)),is thus given by

$\begin{matrix}{{p\left( {{x(n)};{\theta(n)}} \right)} = {\frac{1}{\left( {2{\Pi\sigma}\begin{matrix}2 \\\omega\end{matrix}} \right)\frac{M}{2}}{\exp\left\lbrack {{- \frac{1}{2\sigma\begin{matrix}2 \\\omega\end{matrix}}}\left( {{x(n)} - {H\;{\theta(n)}}} \right)^{T}\left( {{x(n)} - {H\;{\theta(n)}}} \right)} \right\rbrack}}} & (6)\end{matrix}$

In this model, a complete description of a QRS complex is provided bythe deterministic unknown parameter vector θ(n). The absence of a QRScomplex corresponds to the case when θ(n) is equal to 0, where 0 is theP×1 zero vector. In general, no a priori knowledge is available on θ(n),and therefore an estimate is required before detection can take place.Furthermore, only one event is assumed to take place within theobservation interval 0≦n≦N.

Filterbank Representation

The descriptive functions in H have been selected such that thefollowing three aspects have been taken into particular account:

-   -   1. the main morphologies of the QRS complex are mono- and/or        biphasic    -   2. the broad range of QRS complex durations, and    -   3. a low complexity implementation.

The wavelet transform is particularly suitable since it is a localtransform, i.e., it provides information about the local behaviour of asignal. One wavelet decomposition method which may be efficientlyimplemented is the dyadic wavelet transform. By careful selection of thefilters, a suitable filter bank including mono-and biphasic impulseresponses can be obtained. A symmetric lowpass filter, f(n), is usedrepeatedly in order to achieve proper frequency bands. This filter iscombined with one of two filters, g_(b)(n) or g_(m)(n), which togetherdefine the waveform morphology (the subindices b and m denote bi- andmonophasic, respectively). For the biphasic case, the recursion isexpressed as,

$\begin{matrix}{{{h_{1,b}(n)} = {g_{b}(n)}}{{h_{2,b}(n)} = {{f(n)}*{g_{b}\left( {2n} \right)}}}{{h_{3,b}(n)} = {{f(n)}*{f\left( {2n} \right)}*{g_{b}\left( {4n} \right)}}}\vdots{h_{q_{\max},b}(n)} = {{f(n)}*\mspace{14mu}\ldots\mspace{14mu}*{\int{\left( {2^{q_{{mas}^{- 3}}}n} \right)*{g_{b}\left( {2^{q_{\max^{- 1}}}n} \right)}}}}} & (7)\end{matrix}$in which the subindex q_(max) represents the maximum (coarsest) scale.

It is now possible to present an expression for-H which is composed ofone biphasic and one monophasic part,H=└{tilde over (H)}_(b) {tilde over (H)}_(m)┘  (8)

-   -   where the biphasic H_(b) is defined as

$\begin{matrix}{{H_{b} - \left\lbrack {h_{q_{\min},b}\mspace{14mu}\ldots\mspace{14mu} h_{q_{\max},b}} \right\rbrack} = \begin{bmatrix}{h_{q_{\min},b}(0)} & \ldots & {h_{q_{\max},b}(0)} \\\vdots & ⋰ & \vdots \\{h_{q_{\min},b}\left( {M - 1} \right)} & \ldots & {h_{q_{\max},b}\left( {M - 1} \right)}\end{bmatrix}} & (9)\end{matrix}$where the subindex q_(min) represents the minimum scale andq_(min)<q_(max). The monophasic matrix H_(m) is computed in acorresponding way. The reversed order of the columns in Ĥ, denoted with{tilde over (H)} in (8), is introduced in order to be consistent withthe model assumed in (4).

In order to mimic the desired mono- and biphasic waveforms with a lowcomplexity filter bank structure, short filters with small integercoefficients were used. In (7), the impulse response f(n) was chosen asa third order binomial function,

$\begin{matrix}{{F(z)} = {{\left( {1 + z^{- 1}} \right)^{L_{\underset{\_}{\underset{\_}{L = 2}}}}1} + {3z^{- 1}} + {3z^{- 2}} + z^{- 3}}} & (10)\end{matrix}$

-   -   where F(z) is the Z-transform of f(n). For the biphasic filter        bank, the filter g_(b)(n) was selected as the first order        difference,        g _(b)(n)=[−1 1]  (11)

The filter g_(m)(n) was chosen such that a compromise between therequirement of a DC gain equal to zero and an approximately monophasicimpulse response was achieved. A reduction of complexity results wheng_(b) (n) is reused,g _(m)(n)=g _(b)(n)*g _(b)(n)=[1−2 1]  (12)

For this particular choice of g_(m)(n) and by using Mallat's algorithm,it is possible to calculate both the biphasic and the monophasic filteroutput from each scale by using f(n) once and g_(b)(n) twice. The filterbank impulse responses are shown in FIGS. 1 a and 1 b. The filter bankincludes two orthogonal signal sets where the width of the signal varieswithin each set. In FIG. 1 a, h_(j,b) (n) is shown for j=2, . . . ,4from top to bottom, and in FIG. 1 b, the corresponding h_(j,m)(n) areshown.

ML Parameter Estimation

The unknown coefficient vector θ(n) can be estimated by using themaximum likelihood criterion according to,

$\begin{matrix}{{\hat{\theta}(n)} = {\arg_{\begin{matrix}\max \\{\theta{(n)}}\end{matrix}}{p\left( {{x(n)};{\theta(n)}} \right)}}} & (13)\end{matrix}$

However, {circumflex over (θ)}(n) is only of interest for those n forwhich the probability of an event, or, equivalently, for which thelikelihood ratio test function T(x(n)) is maximized,

$\begin{matrix}{\hat{n} = {\arg_{\begin{matrix}\max \\n\end{matrix}}{T\left( {x(n)} \right)}}} & (14)\end{matrix}$

For this case, T (x(n)) can be shown to be [14],T(x(n))=x(n)^(T) H(H ^(T) H)⁻¹ H ^(T) x(n)  (15)

-   -   for the case when σ_(ω) ² is assumed to be constant.

The optimal estimate {circumflex over (θ)} for the detected event at{circumflex over (n)} is thus expressed as

$\begin{matrix}{\hat{\theta} = {\arg_{\begin{matrix}\max \\{\theta{(\hat{n})}}\end{matrix}}{p\left( {{x\left( \hat{n} \right)};{\theta\left( \hat{n} \right)}} \right)}}} & (16)\end{matrix}$

The MLE of θ({circumflex over (n)})(n) is found by maximizingp(x({circumflex over (n)});θ({circumflex over (n)}), or equivalentlyminimizing the MSE,(x({circumflex over (n)})−Hθ({circumflex over (n)}))^(T)(x({circumflexover (n)})−Hθ({circumflex over (n)}))=x({circumflex over (n)})^(T)x({circumflex over (n)})+θ({circumflex over (n)})^(T) H ^(T)Hθ({circumflex over (n)})−2θ({circumflex over (n)})^(T) H ^(T)x({circumflex over (n)})  (17)

Derivation with respect to x({circumflex over (n)}) yields the optimumθ,{circumflex over (θ)}({circumflex over (n)})=(H ^(T) H)⁻¹ H ^(T)x({circumflex over (n)})=(H ^(T) H)⁻¹ H ^(T) x({circumflex over(n)})  (18)

By using the above formulation, it is also possible to derive ageneralized likelihood ratiodetector based on (15).

Rate

The central parameter for classification of cardiac arrhythmias is theheart rate. Most arrhythmias are defined in terms of heart rate,although sometimes with rather fuzzy limits. Consequently, rate shouldbe considered in order to improve performance. The RR interval, Δt, isdefined as the duration between two consecutive events,Δ{circumflex over (t)} _(k)=({circumflex over (n)} _(k) −{circumflexover (n)} _(k-1))Ts  (19)where {circumflex over (n)}_(k) and {circumflex over (n)}_(k-1) denotethe occurrence times of the events, and T_(S) denotes the samplingperiod.Leader-Follower Clustering

The choice of leader-follower clustering is based on a number offeatures which makes it suitable for the present invention, viz.

-   -   on-line processing (non-iterative), and    -   self-learning, i.e., no a priori knowledge of the of clusters is        needed.

The starting point is the assumption that an event is present for whichit should be decided whether it belongs to an already existing clusteror if a new cluster should be initiated. Since, no knowledge is a prioriavailable on which rhythms or morphologies to be expected, the chosenalgorithm must be self-learning. The leader-follower clusteringalgorithm is constituted by four quantities:

-   -   1. The event parameter vector, φ(n_(k)), containing the features        of the k:th event that occurs at time n_(k).    -   2. The cluster center μ_(i), and the covariance matrix Σ_(i);        that together define the i:th cluster. Since both μ_(i) and        Σ_(i) are unknown, they are replaced by their estimates        {circumflex over (μ)}_(i) and {circumflex over (Σ)}_(i);        respectivey.    -   3. The metric, d_(i) ², determine the distance between φ(n_(k))        and μ_(i); according to some suitable function.    -   4. A rule for adaptation of the cluster parameters for the        winning cluster.        Initialization of New Clusters

During run-time, a finite number of clusters exists which represent therhythms having appeared until present time. Thus, it is occasionallynecessary to initialize a new cluster when the existing ones do notsufficiently well fit the present event. When the distance functiond_(j) ² exceeds a certain threshold, η, it is more likely that the eventbelongs to a new cluster than to any of the existing clusters. Theselection of η is a tradeoff between cluster size and cluster resolutionin the sense that choosing a small η will result in many clusters withfew clustering errors. On the other hand, a large η results in fewclusters but in more errors.

The minimum distance between φ(n) and μ_(i); with respect to both i andn is

$\begin{matrix}{{{d\begin{matrix}2 \\\min\end{matrix}} = {{\underset{i,_{l}}{\min^{d_{i}^{2}}}{(1)\mspace{14mu} i}} = l}},\mspace{14mu}\ldots\mspace{14mu},{{l\mspace{14mu} l} = {{\hat{n}}_{k} - \frac{K}{2}}},\mspace{14mu}\ldots\mspace{14mu},{{\hat{n}}_{k} + \frac{K}{2}}} & (20)\end{matrix}$

-   -   over the search interval K. The corresponding minimum distance        indices are found as

$\begin{matrix}{\left\lbrack {i_{\min},n_{\min}} \right\rbrack = {\arg_{\begin{matrix}\min \\{i,l}\end{matrix}}{d_{j}^{2}(l)}}} & (21)\end{matrix}$

The decision rule based on the comparison of d_(min) ² and η isexpressed as

$\begin{matrix}{d_{\min^{2}}\left\{ \begin{matrix}{\rangle\eta\text{:}\mspace{14mu}{Initialize}\mspace{14mu}{new}\mspace{14mu}{cluster}} \\\left\langle {\eta\text{:}\mspace{14mu}{Assign}\mspace{14mu}{to}\mspace{14mu}{winning}\mspace{14mu}{cluster}} \right. \\\left\langle {{\rho\eta}\text{:}\mspace{14mu}{Update}\mspace{14mu}{winning}\mspace{14mu}{cluster}} \right.\end{matrix} \right.} & (22)\end{matrix}$

-   -   where 0<p<1. When the upper relation in (22) holds, a new        cluster is initialized by first increasing the number of        clusters by one, I=I+1, given that I<I_(max) where I_(max) is        the maximum number of clusters, and then initializing the new        cluster as,        {circumflex over (μ)}₁=Φ({circumflex over (n)} _(k))  (23)

On the other hand, if I=I_(max) the algorithm needs to discard one ofthe existing clusters. This can be done by elimination of, e.g., theoldest or the “smallest” cluster. By the term “smallest” cluster ismeant that cluster which most rarely is fitted with a detected cardiacevent.

For the middle and lower conditions in (22), the existing minimumdistance cluster i_(min) is selected as the winning cluster. However,only the lower condition results in a cluster parameter update. Thereason for including such a distinction is that only closely similarevents should be used for cluster updates in order to reducecontamination.

Mahalanobis Distance Function

The distance between the feature vector θ(n) and each cluster{circumflex over (μ)}₁, is defined as the Mahalanobis distance, which isa normalized Euclidean distance in the sense that it projects theparameter vector elements onto univariate dimensions by including theinverse covariance matrix Σ_(i) ⁻¹. Thus, a feature with a largervariance in φ(n) will be assigned a larger share of the hyperspacebefore normalization compared to that with a lower variance. Aconsequence of normalization is that the Mahalanobis distance works wellon correlated data since Σ_(i) ⁻¹ then acts as a decorrelator.

When searching for the minimum distance, a grid search over n isperformed. This grid search is necessary since it not only minimizes thedistance but also results in a more accurate fiducial point estimatethan what would be the case when only considering T(x({circumflex over(n)}_(k))). The minimum distance is thus found by a grid search withrespect to all existing clusters, i=1, . . . , I, and all featurevectors within the duration of an event I as defined in (20),

$\begin{matrix}{{d_{i}^{2}(l)} = {\left( {{\Phi(l)} - {\hat{\mu}}_{i}} \right)^{T}{{\hat{\Sigma}}_{i}^{- 1}\left( {{\Phi(l)} - {\hat{\mu}}_{i}} \right)}}} & (24)\end{matrix}$Reference Feature Adaptation

In order to track changes in the features of the different rhythms,adaptation of both μ_(i min) (k) and

Σ̂_(i  min )⁻¹(k)are desirable, here the event index k has been included for clarity. For{circumflex over (μ)}_(imin) (k), an exponentially updated average isused:μ_(i) _(min) (k)={circumflex over (μ)}_(i) _(min) (k−1)+γε(k)  (25)whereε(k)=Φ(n _(min))−{circumflex over (μ)}_(i) _(,om) (k−1)  (26)

The exponential update factor γ is confined to the interval 0<γ<1.

The inverse covariance matrix,

Σ̂_(i  min )⁻¹(k),is estimated by exponential averaging of the new cluster differencematrix ε(k)ε^(T)(k) using the update factor (1−α),

$\begin{matrix}{{{\hat{\Sigma}}_{i_{\min}}^{- 1}(k)} = \left( {{(\alpha)\hat{\Sigma}{i_{\min}\left( {k - 1} \right)}} + {\left( {1 - \alpha} \right){ɛ(k)}{ɛ^{T}(k)}}} \right)^{- 1}} & (27)\end{matrix}$

Using the matrix inversion lemmaA=B ⁻¹ +CD ⁻¹ C ^(T)  (28)A ⁻¹ =B−BC(D+C ^(T) BC)⁻¹ C ^(T) B  (29)and pairing the terms in (27) with the ones in (28),A={circumflex over (Σ)} _(imin)(k)  (30)B ⁻¹α{circumflex over (Σ)}_(imin)(k−1)  (31)C=ε(k)  (32)D ⁻¹=(1−α)  (33)The inverse {circumflex over (Σ)}_(i min) ⁻¹ (k−1) in (27) may becomputed without any matrix inversions as,

$\begin{matrix}{{{\hat{\Sigma}}_{i_{\min}}^{- 1}(k)} = {{\alpha^{- 1}{{\hat{\Sigma}}_{i_{\min}}^{- 1}\left( {k - 1} \right)}} - \frac{\alpha^{- 1}{{\hat{\Sigma}}_{i_{\min}}^{- 1}\left( {k - 1} \right)}{ɛ(k)}{ɛ^{T}(k)}\alpha^{- 1}{{\hat{\Sigma}}_{i_{\min}}^{- 1}\left( {k - 1} \right)}}{\left( {1 - \alpha} \right)^{- 1} + {{ɛ^{T}(k)}\alpha^{- 1}{{\hat{\Sigma}}_{i_{\min}}^{- 1}\left( {k - 1} \right)}{ɛ(k)}}}}} & (34)\end{matrix}$

By utilizing the matrix inversion lemma, the computational complexity ofthe operation is reduced from O(P³) to O(P²).

Algorithm Initialization

In leader-follower clustering, clusters are initialized as they becomeneeded. This feature is convenient since it does not automaticallyintroduce any unused clusters. However, it also puts demands on thealgorithm to be able to create new clusters when necessary, and also toterminate clusters either not used for long or with only a few events.Initially, the total number of clusters, I, is equal to one. Thealgorithm is initialized by assigning the parameter vector Φ({circumflexover (n)}_(i)), which maximizes the test statistic in (14) of the firstevent, to the first cluster, cf. (23),{circumflex over (μ)}₁=Φ({circumflex over (n)} ₁)  (35)

For the general case, Φ({circumflex over (n)}_(k)) is composed of asubset of the representative functions together with the precedingRR-interval,

$\begin{matrix}{{\Phi\left( {\hat{n}}_{k} \right)} = \begin{bmatrix}\frac{{\hat{\theta}}_{s}\left( {\hat{n}}_{k} \right)}{{{\hat{\theta}}_{s}\left( {\hat{n}}_{k} \right)}} \\{\Delta\; t_{k}}\end{bmatrix}} & (36)\end{matrix}$

-   -   where {circumflex over (θ)}_(s)({circumflex over (π)}_(k)) is a        subset of the most discriminating elements in {circumflex over        (θ)}_(s)({circumflex over (π)}_(k)).

It should be noted that an event is defined by its depolarization wave.Consequently, Δt is not included in the arrival time estimation of{circumflex over (n)}_(k), but instead computed afterwards. The timecontinuous notation of Δt_(k) is preferred since it results in asuitable magnitude similar to the normalized morphological informationin θ_(s).

The inverse correlation matrix estimate is initialized in the same wayfor all clusters; a simple solution is to set it equal to a scaling ofthe identity matrix I,

$\begin{matrix}{{\hat{\Sigma}}_{i}^{- 1} = {\delta\; I}} & (37)\end{matrix}$

-   -   where σ is a design parameter. The complete clustering algorithm        for organized events is presented in the flow chart in FIG. 2.        Reduced-Complexity Clustering

In order to develop a more efficient algorithm in terms of performanceversus power consumption and to evaluate the power consumption itself, asimple measure of the computational complexity can be used, namely, thetotal number of multiplications. This quantity represents a much morecomplex operation than do additions. In this algorithm, where mostoperations are of the nature “multiply-accumulate”, the number ofadditions is of the same order as the number of multiplications and maythus be neglected without significant loss of accuracy.

In order to reduce the computational complexity, focus is put onreducing the number of multiplications. The dominant contributions ofmultiplication operations are found in (24) and (34) which requireP(P+1) and P/2 (3P+5) multiplications, respectively, considering certainsymmetry properties of

Σ̂_(i)⁻¹.Furthermore, one division is required in (34). However, according to(20), (24) is performed IK times per event while (34) is performed onlyonce per event.

Based on the above performance figures, a few approximations can beidentified:

-   -   to use only the peak(s) in T(x({circumflex over (n)})) instead        of a complete grid search,    -   to use a simplified

Σ̂_(i)⁻¹,

-   -    and    -   to use a likelihood based search sequence over the clusters,        i.e., to start with the most likely cluster and to stop the        search if a sufficiently small distance is found.

Simplifying the grid search from spanning both samples and clusters in(20) to span over only clusters,

$\begin{matrix}{d_{\min}^{2} = {\min\limits_{i}{d_{i}^{2}\left( \hat{n} \right)}}} & (38)\end{matrix}$

-   -   the number of multiplications may be reduced by a factor K to        IP(P+1). Due to sensitivity in T(x({circumflex over (n)})), the        feature vectors resulting in the peaks for two different events        may differ significantly, this simplification is likely to        result in more clusters. A useful compromise may instead be to        use the coefficients from, e.g., the 3 largest peaks in        T(x({circumflex over (n)})) resulting in 3IP(P+1)        multiplications per event.

Since a cardiac event lasts for longer time than one sample thosesamples which give the filter coefficients which are most similar to thecluster reference are determined. A grid search over 40 msec ispreferably made. In this way coefficients of greatest importance locallyare determined. For this decision T(x(n)) is considered and samplesgenerating a peak are chosen, i.e. T(x(n−1))<T(x(n))>T(x(n+1)), sincethey indicate the probability for the presence of an event.

Another simplification, based on the approximation of orthogonalitybetween the different wavelet scales as well as the RR interval, is tosimplify

Σ̂_(i)⁻¹by only including its diagonal elements in the adaptation,

$\begin{matrix}{{\hat{\Sigma}}_{i_{\min}}^{- 1} \approx {{\alpha^{- 1}{\hat{\Sigma}}_{i_{\min}}^{- 1}} + {\left( {I - \alpha} \right){{diag}\left( {{ɛ(k)}{ɛ^{T}(k)}} \right)}}}} & (39)\end{matrix}$where diag(A) returns the diagonal elements of the square matrix A. Byusing this approximation, the number of multiplications used for theestimation of

Σ̂_(i)⁻¹is reduced to 3P. Additionally, the distance computation in (20) issimplified and may be reduced to 2IKP multiplications per event.

A reasonable assumption is often that successive events originate fromthe same rhythms. Considering this knowledge, it would be sufficient tocompute the distances for the previously selected cluster. In doing so,the number of multiplications in (38) are reduced even further.

Table 1 presents the different detector versions as defined by theirdistinguishing features and shows computational complexity for thedifferent versions of the clustering algorithm.

Features 1 Version Σ̂_(i  min )⁻¹ Search alg. Complexity C A FullInterval${{IKP}\left( {P + 1} \right)} + {\frac{P}{2}\left( {{3P} + 5} \right)}$B Diagonal Interval 2IKP + 3P C Full 3 peak${3{{IP}\left( {P + 1} \right)}} + {\frac{P}{2}\left( {{3P} + 5} \right)}$D Diagonal 3 peak 6IP + 3P

The total computational complexity, C, as reflected by the number ofmultiplications for the different algorithm versions, is presented inFIG. 3 as a function of 1. In FIG. 3 the solid line shows an algorithmversion A, dashed line a version B, dotted line a version C, and aversion D by dash-dotted line), for a feature vector with (a), P=4, and(b), P=7 elements.

Results

The results are obtained by studying the performance of the followingquantities:

-   -   algorithm versions, as presented in Table 1, and    -   noise tolerance, for noise-free signals and for signals with        background noise of 50 μV RMS.

The following parameter settings have been used (unless otherwisestated),α=1.05 γ=0.025 δ=50 K=40  (49)

It is noted that α⁻¹ is chosen to offer faster adaption than γ. Thereason for that is that the initial estimate

Σ̂_(i)⁻¹is likely to be less accurate than the initial estimate {circumflex over(μ)}_(i). Also, δ is chosen to have the same order of magnitude as thesteady state eigenvalues

Σ̂_(i)⁻¹.

It should be noted that the different algorithm versions are not fullycomparable in terms of performance for a specific η due to thedifferences in distance computation. In versions B and D, where adiagonal

Σ̂_(i_(min))⁻¹is used, the lack of non-diagonal information results in a nonorthogonaldistance which is larger than the orthogonal one. Since versions C and Dmake use of a limited search, the minimum distance found may differ fromthe global minimum distance for the event. Both these algorithmicdifferences imply an increase in clustering quality for a certain valueof η, however, at the expense of more introduced clusters.Evaluation Measures

The two main quantities evaluated are clustering performance andcomputational complexity, see FIG. 3; these two quantities are ingeneral conflicting. In the evaluation, a cluster is assigned to thatcardiac rhythm which contains the most events in the cluster. Thisrhythm is denoted as the dominant rhythm within the cluster. Withrespect to the dominant rhythm, the cluster is defined as a correctcluster, whereas it is erroneous to all other rhythms. If a rhythm isfound to be dominant in more than one cluster, such clusters are firstmerged in the performance evaluation. The number of events in thecorrect cluster which belongs to the i:th dominant rhythm is equal toN_(D)(i), while the number of events belonging to any other false rhythmin the cluster is equal to N_(F)(i). The number of events of thedominant rhythm which are not classified in a correct cluster, i.e.,missing, is equal to N_(M)(i).

The performance of the algorithm is evaluated in terms of probability ofa correct clustering of an event, P_(CC)(i), and probability of adominant event in a cluster, P_(DE)(i). The first parameter is expressedas the share of correctly clustered events within a rhythm and is, usingthe above parameters, defined by

$\begin{matrix}{{{Pcc}(i)} = \frac{N_{D}(i)}{{N_{D}(i)} + {N_{M}(i)}}} & (41)\end{matrix}$

The second parameter may be expressed as the share of dominant rhythmevents within a cluster, and is defined as

$\begin{matrix}{{P_{DE}(i)} = \frac{N_{D}(i)}{{N_{D}(i)} + {N_{M}(i)}}} & (42)\end{matrix}$

For the case when a rhythm completely lacks a correct cluster, P_(DE)(i)is undefined; the rhythm is then excluded from subsequent statisticalcomputations. Averaging P_(CC)(i) and P_(DE)(i) over all clustersresults in the global performance measures P_(CC) and P_(DE),respectively.

It should be pointed out that P_(CC)(i) and P_(DE)(i) reach theirmaximal value of 1 when η is sufficiently small such that the number ofclusters equals the total number of beats evaluated. This is, of course,a highly undesirable solution although performance, as expressed in (41)and (42), will be excellent. For this reason, the total number ofclusters, I, is a crucial parameter to be considered. Here, the averageI is used together with the minimum and maximum number of clusters for acase, I_(min), and I_(max), respectively.

The power consumption of the algorithm is an important parameter whichdetermines the pacemaker life span. In this study, power consumption isapproximated by the computational complexity defined above as the totalnumber of multiplications. As shown, the computational complexitydepends mainly on three parameters: the feature vector length, P, thesearch interval, K and the cluster size, I.

Noise-Free Signal Clustering Performance

FIGS. 4 a and 4 g illustrate clustering performance in terms of P_(DE)and P_(CC), see FIG. 4 a, and I_(min), Ī and I_(max), see FIG. 4 (b).Thus in FIG. 4 a clustering performance is shown as depending on Ī forP_(CC) (dark bars) and P_(DE) (bright bars) for noise-free signals. InFIG. 4 b, the spread between the different cases is shown as, from leftto right, I_(min), I and I_(max). The presented algorithm versions arefound in Table 1 and three values of I have been chosen for comparison;3, 4 and 5. Versions A and B perform similarly for all three cases, andachieve P_(DE)=1 and P_(CC)=1 for Ī=4 and Ī=5, respectively, by creatingan acceptable number of clusters. However, it can be seen from FIG. 4 bthat a large difference in the number of initialized clusters betweendifferent cases is present. For version B, a slight increase in both Īand I_(max) is observed for Ī=4. However, this increase is more due toan unfortunate step-like behaviour in the results for the given Ī thanfor any significant decrease in performance. The values of η used inFIGS. 4 a and 4 b for the different versions are shown in Table 2.

TABLE 2 Values of η used in FIG. 4. Algorithm version Ī A B C D 5 3.64.2 5.4 5.8 4 4.6 4.8 7.0 7.2 3 9.6 9.6 12.0 10.8

Versions C and D perform slightly worse than do versions A and B.Contrary to the latter ones, neither version achieves P_(DE)=1 orP_(CC)=1 for the presented Ī, see FIG. 4 a.

FIG. 5 a shows the clustering performance for version A in detail andits dependence on η. The most noteworthy result is that both P_(DE)=1and P_(CC)=1 for η<7. It is also clear that clustering performancedeteriorates rapidly for η>10. The increase in P_(CC) for very highvalues of η is due to that not all rhythms are allocated to a dominantcluster and are therefore disregarded in the performance computation.

In FIG. 5 b, I_(min), Ī and I_(max) are presented as a function of theclustering threshold. Thus FIG. 5 a shows clustering performance interms of P_(DE) (solid line) and P_(CC) (dashed line) as a function of ηfor noise-free signals, and FIG. 5( b) the corresponding I_(min), Ī andI_(max). For η<3, the number of initiated clusters increases rapidlywith decreasing rI. Within the interval 3<η<7, all rhythms initiate atleast one cluster, while for η>7, this is not true for all cases.Removing the “worst case”, the above is true for the other cases forη<10.

The clustering performance is exemplified for one set of parameters inFIG. 6 using η=7. Correct clustering with minimal number of clusters isachieved for two cases while an extra cluster is initialized for twocases. The reason for the extra cluster is that, for case 2, thetemporal search interval is chosen too small for the third morphologyfrom the left resulting in an extra cluster. For case 5 an actualdifference in morphologies both on the up and down slopes of thedominant peak is discernible between the first and fourth clusters fromthe left.

Clustering for a concatenated electrogram is presented for case 3 inFIG. 7, where three distinct rhythm classes can be discerned, viz.normal sinus rhythm followed by supraventricular tachycardia and atrialflutter. The EGM is shown together with the clusters, represented by o,x and +, respectively, assigned for each event. The different rhythmsresult in three clusters.

FIG. 8 is a flow chart illustrating the function in broad outline of anembodiment of the apparatus according to the invention. At block 2cardiac event features are extracted in the form of waveletcoefficients, and the event is detected, at 4, 6. At block 8 is checkedwhether the detected event is member of a labelled cluster. If so, theevent is added to a class, at 10, and actions associated with that classare performed, at 12.

If the event is not member of a labelled class it is checked if it is amember of an existing cluster, at 14. If so, the event is added to aclass, at 16, and it is checked if it is possible to label the cluster,18, and if so the cluster is labelled, at 20.

If the event is not a member of an existing cluster, block 14, a newcluster is created as described above fitted to the detected cardiacevent, at 22.

Thus according to the invention clustering events in the EGM isperformed for use in implantable CRM devices, like heart stimulators.The invention is based on feature extraction in the wavelet domainwhereupon the features are clustered based on the Mahalanobis distancecriterion. According to advantageous embodiments of the apparatusaccording to the invention simplifications of the technique is proposedin order to reduce computational complexity to obtain a moreimplementationally feasible solution.

If the apparatus according to the invention is to be used for longerperiods of data analysis, large clusters, although old, may be desirableto be kept in some way, while the oldest cluster may be selected to beremoved if the application is based on shorter time frames. Also, due tothe short data lengths, testing of such algorithms would be of limitedvalue.

By combining detector/clusterer with labelling rules of a classifier acomplete detector/classifier is obtained with the possibility to moreaccurate therapy.

The labelling need not be done in real time and probably more than oneevent will be needed to label a cluster. Once the cluster has beenlabelled using clinical terms, the actions associated with theparticular class will be carried out immediately, i.e. in real time.Thus, the rules needed to label the cluster are not used in identifyingthe event itself.

The rules used to label the clusters are based on characteristics of thedifferent possible events. Instead of the exact rules, thecharacteristics are consequently described.

FIG. 9 is a block diagram of an embodiment of an implantable heartstimulator provided with the apparatus according to the invention.Electrodes 30, 32 implanted in the heart 34 of a patient are connectedby a lead 36 to an AID converter 40, via a switch 38 serving asovervoltage protection for the A/D converter 40. In the A/D converter 40the signal is A/D converted and the digital signal is supplied to awavelet detector 42.

The detector 42 decides whether a cardiac event is present or not asdescribed earlier. Wavelet coefficients are calculated as well.Parameters of the detector 42 are programmable from the stimulatormicroprocessor 44. At the detection the coefficients and the RRinformation are forwarded to the clusterer 46 in which it is determinedto which cluster the detected cardiac event belongs, as describedpreviously. The clusterer 46 is preferably of a leader-follower type andalso the cluster parameters are programmable from the microprocessor 44.

By the microprocessor 44 suitable therapy is decided depending onassigned cluster for the detected event and possible a priori knowledgeabout arrhythmia associated with the cluster in question. Thus it ispossible to distinguish e.g. ventricular tachycardia from a sinustachycardia by comparing the parameters with a known normal sinusrhythm. Parameters of the sinus tachycardia are then supposed to besimilar to those of a normal sinus rhythm, whereas parameters ofventricular tachycardia differ significantly.

As an alternative the decision rules can be trained from a number ofrhythms and the resulting rules are then used on test data, see WeichaoXu et al., “New Bayesian Discriminator for Detection of AtrialTachyarrhythmias”, DOI:10.1161/01.CIR.0000012349.14270.54, pp.1472-1479, January 2002. It is then possible to decide that a certainposition of the cluster indicates e.g. a sinus rhythm, etc. Also thistechnique can be based on analysis of the feature vector for a cluster,and it is possible to decide if the beat is broad or narrow, large orsmall, or if the rhythm is regular or irregular.

The stimulator in FIG. 9 also includes a pulse unit 48 with associatedbattery 50 for delivery of stimulation pulses to the patient's heart 34depending on the clustering evaluation.

The implantable stimulator shown in FIG. 9 includes telemetry means 52with antenna 54 for communication with external equipment, like aprogrammer.

Although modifications and changes may be suggested by those skilled inthe art, it is the intention of the inventor to embody within the patentwarranted hereon all changes and modifications as reasonably andproperly come within the scope of his contribution to the art.

1. An apparatus for analyzing cardiac events, comprising: a featureextraction unit, supplied with an electrocardiogram, said featureextraction unit configured for deriving features of cardiac events fromsaid electrocardiogram and determining a feature vector, by a wavelettransform operating on said electrogram, describing waveformcharacteristics of cardiac events in said electrogram; and a clusteringunit provided with said feature vector, said clustering unit configuredfor determining a distance between said feature vector and correspondingcluster feature vectors and assigning a cardiac event in saidelectrogram to a particular cluster that results in a minimum distance,as long as said minimum distance is less than a predetermined thresholdvalue, and generating an output signal that identifies the particularcluster to which the cardiac event has been assigned.
 2. An apparatus asclaimed in claim 1 wherein each of said clusters is defined by a clustercenter μ_(i) and a co-variance matrix Σ_(i) for the cluster features ofthat cluster, and wherein said clustering unit determines a distancefunction D2 between each event feature vector and said cluster centerμ_(i).
 3. An apparatus as claimed in claim 2 wherein said clusteringunit calculates said distance using Mahalanobis distance criterion. 4.An apparatus as claimed in claim 2 wherein said clustering unitdetermines said minimum distance by a grid search over a duration of thecardiac event.
 5. An apparatus as claimed in claim 1 comprising anintegrator that integrates said distance over a predetermined period oftime.
 6. An apparatus as claimed in claim 1 wherein said clustering unitupdates said cluster feature according to a predetermined rule dependenton said minimum distance.
 7. An apparatus as claimed in claim 1 whereinsaid clustering unit generates a new cluster if said minimum distanceexceeds said predetermined threshold value, by setting features for saidnew cluster equal to the event features of the cardiac event thatresulted in said minimum distance exceeding said predetermined thresholdvalue.
 8. An apparatus as claimed in claim 1 wherein said clusteringunit terminates clusters that fail to have a predetermined number ofcardiac events grouped therein within a predetermined time period.
 9. Anapparatus as claimed in claim 1 wherein said clustering unit performs alikelihood-based search sequence over said clusters to determine saidminimum distance.
 10. An apparatus as claimed in claim 1 wherein saidclustering unit determines said minimum distance by a grid search onlyover clusters in which cardiac events have been grouped within aduration of the cardiac event.
 11. An apparatus as claimed in claim 1wherein said clustering unit calculates a distance of a cardiac event inquestion from a cluster in which a cardiac event was previously grouped.12. An apparatus as claimed in claim 1 comprising a classifier thatassociates said clusters respectively with different cardiac rhythmsaccording to predetermined rules.
 13. A heart stimulator comprising: apulse generator configured to interact with a subject to deliverstimulation pulses to the subject; an apparatus that analyzes cardiacevents comprising a feature extraction unit, supplied with anelectrocardiogram, said feature extraction unit configured for derivingfeatures of cardiac events from said electrocardiogram and determining afeature vector, by a wavelet transform operating on said electrogram,describing waveform characteristics of cardiac events in saidelectrogram, and a clustering unit provided with said feature vector,said clustering unit configured for determining a distance between saidfeature vector and corresponding cluster feature vectors and assigning acardiac event in said electrogram to a particular cluster that resultsin a minimum distance, as long as said minimum distance is less than apredetermined threshold value; and an arrhythmia detection and controlunit connected to said pulse generator that controls emission ofstimulation pulses from said pulse generator dependent on detection ofan arrhythmia dependent on the cluster in which the cardiac event isgrouped.